Load balancing in a data delivery system

ABSTRACT

The time taken for connection establishment is monitored to aid in selecting load distribution among nodes in a data delivery system, such as a server cluster. The failure of a node to respond to a connection request may be used to identify a crashed node. The number of connections being maintained and the amount of bandwidth being consumed may also be monitored for each node, and this information may be used to determine when a node should be removed from contention for new connection requests and when a node should be reinstated to receive new connection requests.

BACKGROUND

[0001] The present application describes systems and techniques relating to load balancing in a data delivery system, for example, balancing server load in a server cluster.

[0002] A network is a collection of nodes coupled together with wired or wireless communication links, such as coax cable, fiber optics or radio frequency bands. Each node is capable of communicating with other nodes over the communication links using networking protocols. A node may be any machine capable of communicating using the network protocol. An inter-network is a collection of computer networks coupled together by routers (also known as gateways) and an inter-networking protocol.

[0003] A server cluster is a group of independent data servers, or nodes, coupled with a network and managed as a unified data delivery system. The servers in the cluster cooperate in providing data to requesting client devices. A load balancer or redirector may be used to distribute client requests across the servers within the cluster. Distributing load across different servers allows the server cluster to handle large numbers of concurrent requests and large volumes of requested data, while keeping response time to a minimum.

[0004] A load balancer commonly performs two interrelated functions: (1) server selection and (2) request translation. Request translation refers to the conversion of a client request directed to the server cluster generally, into a specific request directed to an individual server within the cluster. Server selection involves choosing an individual server to process a particular client request.

[0005] A common example of a server cluster is a web server cluster. A web server provides users with access to data, which is typically in the form of HTML (Hypertext Markup Language) documents and software organized into a web site. Each web server in a conventional web server cluster typically stores identical web site content and typically runs mirroring software to maintain the duplicate content across all the servers in the cluster.

[0006] With regard to request translation, traditional approaches to load balancing in a web server cluster include a Domain Name Server (DNS) approach and a reverse proxy approach. In the DNS approach, a local DNS distributes server load by dynamically resolving a domain name for the cluster into different Internet Protocol (IP) addresses for the web servers in the cluster. In a reverse proxy approach, a reverse proxy server translates relative Universal Resource Locators (URLs) in client requests into absolute URLs addressed to one of the web servers in the cluster.

[0007] With regard to server selection, a traditional approach, referred to as “round robin,” assigns each server in the cluster a position in a logical circle. As each new client request comes in, the load balancer directs the request to the server associated with the next position in the logical circle.
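By way of illustration only, the round robin selection described above may be sketched as follows. The class name, method name and server labels are hypothetical and do not appear in the figures; the sketch assumes a fixed list of servers and makes no attempt to account for server load.

```python
from itertools import cycle


class RoundRobinSelector:
    """Hypothetical selector: each server holds one position in a logical circle."""

    def __init__(self, servers):
        # cycle() walks the list endlessly, returning to the first entry
        # after the last one, which models the logical circle.
        self._ring = cycle(list(servers))

    def next_server(self):
        # Each new client request goes to the next position in the circle.
        return next(self._ring)


# Illustrative server labels (assumed, not from the specification).
selector = RoundRobinSelector(["web-a", "web-b", "web-c"])
picks = [selector.next_server() for _ in range(5)]
```

Because the selection ignores how busy each server is, a slow or crashed server continues to receive its share of new requests, which is the limitation the monitoring techniques described below are intended to address.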

BRIEF DESCRIPTION OF THE FIGURES

[0008] FIG. 1 is a block diagram illustrating an operational environment for a server load balancing system.

[0009] FIGS. 2A and 2B are timing diagrams illustrating Transmission Control Protocol (TCP) connection establishment in an IP network employing server load balancing.

[0010] FIGS. 3A and 3B are state diagrams illustrating respectively a procedure for monitoring load on servers in a server cluster and for adjusting assignments of connection requests.

[0011] FIGS. 3C, 3D and 3E are logic flow diagrams illustrating additional details regarding the states and transitions of FIGS. 3A and 3B.

[0012] FIG. 4 is a block diagram illustrating an example computing environment.

[0013] Details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features and advantages may be apparent from the description and drawings, and from the claims.

DETAILED DESCRIPTION

[0014] The systems and techniques described here relate to server load balancing. The description that follows discusses server load balancing in the context of IP, but may apply equally in other contexts, for example, any network protocol that uses a connection establishment protocol having a connection request message and a corresponding acknowledgement message. As used herein, the term “message” includes any discrete block of data transmitted over a network, including IP segments, IP packets, and Ethernet frames.

[0015] The present inventors developed load-balancing systems and techniques that may better assess server loading and may aid in identifying when a server has crashed. For example, the amount of time taken by each node in a data delivery system to respond to a new connection request, as well as the number of connections being maintained and/or the amount of bandwidth being consumed by each node, may be monitored to determine when a node should be removed from contention for new connection requests and when that node should be reinstated to receive new connection requests. Implementations of the load balancing systems and techniques may include various combinations of the features described below.

[0016] FIG. 1 is a block diagram illustrating an operational environment for a server load balancing system. The server load balancing system includes a server cluster 100, which includes two or more data servers. These data servers may be any machines capable of responding to connection requests received through a network by delivering data. For example, the servers in the server cluster 100 may be web servers.

[0017] The server cluster 100 receives connection requests from multiple access devices 105, which are coupled with an inter-network 110, such as the Internet. A link 115 connects the inter-network 110 with a network system 120, which contains the server cluster 100.

[0018] The network system 120 includes one or more networks 130, 140. The network system 120 may be an autonomous system, such as those in the Internet, which are typically managed by a single entity such as a company, educational institution, government agency, etc. At each ingress/egress point in the network system 120, such as the link 115, a load monitor 125 is attached. Each load monitor 125 monitors network traffic directed to or coming from the server cluster 100. Each load monitor 125 may also report server-loading information to a controller 135. Alternatively, each load monitor 125 reports server-loading information to all other load monitors.

[0019] In the example of FIG. 1, only one ingress/egress point is shown, namely link 115. Thus, only one load monitor 125 is used. However, other implementations could have multiple ingress/egress points and could use multiple monitors. In addition, in some configurations, a single load monitor 125 may be used with multiple ingress/egress points. For example, if the server cluster 100 is coupled with a single network 140 within the network system 120, then a single load monitor 125 may be connected with the network 140, and no other load monitors may be needed. However, in certain implementations, at least two load monitors are always provided, because load monitors are provided in pairs, with the second of each pair serving as a backup in case of failure of the first (e.g., hardware failure, software failure, power source failure, etc.). Alternatively, each load monitor is constructed with redundant systems to provide backup functionality.

[0020] The network system 120 may consist only of a single load monitor, a single network and two servers. Alternatively, the network system 120 may include many networks, many servers functioning as a virtual server cluster (i.e., the servers within the virtual server cluster are not all connected to the same network but rather are distributed among two or more different networks), and many ingress/egress points to the inter-network 110, with each ingress/egress point having its own load monitor.

[0021] Devices inside the network system 120 may also access the server cluster 100. These devices may send messages to the server cluster 100 that do not pass through a load monitor. For example, the controller 135 may have access to the server cluster 100 independently of a data path through a load monitor. Thus, in some implementations, it is unnecessary that all network traffic directed to the server cluster 100 pass through a load monitor.

[0022] In addition to monitoring server cluster traffic, the load monitor 125 may also perform server assignment. For example, the network 140 may be an Ethernet or token ring network, and the load monitor 125 may be the only link between the network 140 and all other networks. In this case, the load monitor 125 functions, in part, as a router.

[0023] In this example, the load monitor 125 identifies messages directed to the server cluster 100, selects a server in the server cluster 100 for a new connection or identifies the appropriate server in the cluster 100 for an existing connection, and routes the messages to the appropriate servers in the cluster 100 over the Ethernet 140. The individual servers within the server cluster 100 are unknown to devices other than the load monitor 125. For data requesting devices, the server cluster 100 appears to be a single logical machine with a single IP address.

[0024] Alternatively, the load monitor 125 reports server-loading information to another device within the network system 120, such as the controller 135 or other machine (e.g., a DNS server or a reverse proxy server), which performs the load balancing based upon the loading information provided. One or more controllers 135 may also be provided to configure one or more load monitors 125 by distribution of policies, and to receive network statistics from one or more load monitors 125. For example, in a large network system 120, multiple controllers 135 may control multiple load monitors 125 and may communicate among themselves in order to gain knowledge regarding the functioning of the entire network system 120 (e.g., information regarding bandwidth capacity, network topology, etc.).

[0025] FIGS. 2A and 2B are timing diagrams 200, 250 illustrating Transmission Control Protocol (TCP) connection establishment in an IP network employing server load balancing. FIG. 2A shows a standard IP “three-way handshake” with no time-outs. A client 210 sends a synchronization message 220 to a server cluster 215 requesting a connection. In IP, the synchronization message 220 is known as a SYN segment because the segment has a SYN flag set, indicating a desire on the part of the client 210 to synchronize sequence numbers, thereby establishing a connection; a SYN segment also includes the client's initial sequence number (ISN).

[0026] The server cluster 215 responds by sending an acknowledgement message 222. The acknowledgement message 222 has an acknowledgement (ACK) flag set and contains the client's ISN plus one. The acknowledgement message 222 is also a SYN segment containing the server cluster's ISN.

[0027] A load monitor 205 measures a time 230 between the synchronization message 220 and the acknowledgement message 222. The manner in which the load monitor measures the time 230 may vary with different implementations. For example, the time 230 may be the difference between a time at which the load monitor 205 sees the synchronization message 220 and a time at which the load monitor 205 sees the acknowledgement message 222. Alternatively, the time 230 may be measured using one or more time stamps within the messages.
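As a minimal sketch of the first measurement option, the time 230 may be computed from the moments at which the load monitor observes each message. The class and method names below are hypothetical, and the connection key (for example, a tuple of client address and port) is an assumption made for illustration. A retransmitted synchronization message restarts the measurement, consistent with the time 282 of FIG. 2B being measured from the second synchronization message.

```python
class HandshakeTimer:
    """Hypothetical tracker for the time between a SYN and its SYN-ACK."""

    def __init__(self):
        # Maps a connection key to the time at which the most recent
        # synchronization message for that connection was observed.
        self._pending = {}

    def on_syn(self, conn_key, now):
        # A repeated SYN for the same key overwrites the earlier time stamp.
        self._pending[conn_key] = now

    def on_syn_ack(self, conn_key, now):
        # Returns the elapsed time, or None if no SYN was observed.
        started = self._pending.pop(conn_key, None)
        if started is None:
            return None
        return now - started


# Illustrative values only: times are in seconds.
timer = HandshakeTimer()
key = ("198.51.100.7", 40000)
timer.on_syn(key, 10.0)
elapsed = timer.on_syn_ack(key, 10.25)
unmatched = timer.on_syn_ack(key, 11.0)
```

Passing the observation time in as an argument, rather than reading a clock inside the class, is a design choice made here so that the sketch can be exercised deterministically.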

[0028] Once the client 210 receives the acknowledgement message 222, the client 210 sends a reply acknowledgement message 224 to the server cluster 215. The reply acknowledgement message 224 has the ACK flag set and contains the server cluster's ISN plus one. The sequence numbers are used by the client 210 and the server cluster 215 to identify the established connection. If packets get delayed in the network, they will not be misinterpreted as part of a later connection.

[0029] The ISN for the server cluster 215 may or may not be for a specific server 215 a within the cluster 215. For example, the connection request 220 may result in an assignment of a server 215 a, and that server 215 a may then communicate directly with the client 210 through the network, using an ISN generated by the server 215 a.

[0030] Alternatively, the connection request 220 may result in an assignment of a server 215 a by the load monitor 205, and that server 215 a may then communicate with the client 210 through the load monitor 205. In this latter case, the load monitor creates the ISN to be sent to the client 210 in the acknowledgement message 222 and handles any necessary translation of sequence numbers sent by the server 215 a into the sequence numbers sent to the client 210. In some cases, the server 215 a may not provide sequence numbers (e.g., if the server cluster 215 resides on a single Ethernet network and the load monitor 205 handles all communications with requesting clients, then server sequence numbers need not be used).
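The translation of sequence numbers is not detailed further here. One possible approach, offered only as an assumption, is to record a fixed offset between the ISN created by the load monitor and the ISN generated by the server, and to apply that offset modulo 2^32, since TCP sequence numbers are 32-bit values. The function names below are hypothetical.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers wrap around at 32 bits


def make_offset(server_isn, monitor_isn):
    # Offset that maps the server's sequence space onto the sequence
    # space the load monitor presented to the client.
    return (monitor_isn - server_isn) % SEQ_MODULUS


def to_client_seq(server_seq, offset):
    # Translate a sequence number sent by the server for the client.
    return (server_seq + offset) % SEQ_MODULUS


def to_server_ack(client_ack, offset):
    # Translate an acknowledgement number sent by the client for the server.
    return (client_ack - offset) % SEQ_MODULUS


# Illustrative values chosen so that the server's numbers wrap around.
offset = make_offset(server_isn=4294967290, monitor_isn=5)
client_view = to_client_seq(4294967295, offset)  # server ISN plus five
server_view = to_server_ack(client_view, offset)
```

In this example the server's sequence number is five past its ISN, so the translated value is five past the ISN created by the load monitor, and the reverse translation recovers the server's original number.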

[0031] FIG. 2B shows an IP “three-way handshake” with a single time-out (by both a client 260 and a load monitor 255) in a timing diagram 250. The client 260 sends a synchronization message 270 to a server cluster 265 requesting a connection. The server cluster fails to respond within a time-out period 280.

[0032] In this example, both the client 260 and the load monitor 255 are using the same time-out period. However, various time-out periods and combinations of time-out periods may be used. For example, the client 260 may use three time-out periods with three synchronization messages in attempting to establish a connection (a first time-out period (e.g., six seconds) measured from an initial request, a second time-out period (e.g., twelve seconds) measured from the end of the first time-out period, and a third time-out period (e.g., seventy-five seconds) measured from the initial request), while the load monitor 255 may use a single time-out period that is much shorter (e.g., 500 milliseconds).

[0033] The client 260 sends a second synchronization message 272 to the server cluster 265 requesting a connection. The server cluster 265 responds with an acknowledgement message 274. The load monitor 255 measures a time 282 between the second synchronization message 272 and the acknowledgement message 274, as described previously. Once the client 260 receives the acknowledgement message 274, the client 260 sends a reply acknowledgement message 276 back to the server cluster 265.

[0034] The synchronization message 270 and the second synchronization message 272 may or may not be received by the same server within the server cluster 265. For example, when the load monitor 255 determines that an assigned server 265 a has failed to respond to a connection request within the time-out period 280, the load monitor 255 may cause the server 265 a to be removed from contention for future connection requests. In that case, when the second synchronization message 272 arrives, a different server, e.g., server 265 b, will respond.

[0035] Moreover, the assigned server 265 a may be taken out of a connection at other times. For example, if the time-out period 280, used by the load monitor 255, is much shorter than a time-out period used by the client 260, the synchronization message 270 may result in a timeout for a server well before the client timeout. The load monitor 255 may then transmit a new connection request to a different server, and this different server then replies to the client 260 before the client's time-out period expires. Thus, the client's single request results in two requests being received by two separate servers.

[0036] Alternatively, the load monitor 255 may determine that a server 265 a, which is in established communication with a client 260, is becoming overloaded or has crashed, and reassign the existing connection to another server 265 b. The load monitor 255 may assess overloading of a server using one or more factors, including the server's response time, the number of connections the server is managing, and the amount of bandwidth the server's connections are consuming.

[0037] FIGS. 3A and 3B are state diagrams illustrating a procedure for monitoring load on servers in a server cluster and for adjusting assignments of connection requests. FIG. 3A illustrates a procedure performed by an example load monitor to track load on servers in a server cluster and to generate messages to cause rebalancing of server load. FIG. 3B illustrates a procedure performed by an example connection assignor to assign new connection requests to the servers based upon the messages received from the load monitor.

[0038] The connection assignor may be a software process executing in a load monitor or in an entirely separate device. Although FIGS. 3A and 3B depict the procedures for monitoring server loads and for adjusting connection request assignments separately, they need not be performed separately but rather may be performed in the same procedure. A single load monitor may perform all of the functions described here, including assignment of new connection requests.

[0039] Referring now to FIG. 3A, state S0 is an initial state for the load monitor. When a client message is identified, a transition is made from state S0 to state S1. In state S1, the client message is input data for the next transition. If the client message is a connection request (e.g., a SYN segment in IP), a server response timer is started for this connection request, and a transition is made back to state S0.

[0040] If the client message is a connection termination (e.g., a FIN (finished) segment in IP), a current state of a server handling this connection is checked, a rebalancing message may be sent, and a transition is made back to state S0. Any other type of client message results in a transition back to state S0 with no processing or output. The processes and outputs of state S1 are described in greater detail below in connection with FIG. 3C.

[0041] When a server message is intercepted while the load monitor is in state S0, a transition is made to state S2. In state S2, the server message is input data for the next transition. If the server message is a connection acknowledgment (e.g., a SYN-ACK segment in IP), a connection count is incremented and checked, the server response timer is checked, a rebalancing message may be sent, and a transition is made back to state S0. If the server message contains any other data, a current bandwidth consumption is calculated, and a transition is made back to state S0. The processes and outputs of state S2 are described in greater detail below in connection with FIG. 3D.

[0042] When a timer expires while the load monitor is in state S0, a transition is made to state S3. In state S3, a check is made for any outstanding connection requests that have not been responded to in a timely fashion, a check is made for excessive bandwidth consumption, a rebalancing message may be sent, and a transition is made back to state S0. The timer that expired may be a programmed interrupt timer, such as the server response timer, and/or it may be a periodic interrupt timer that causes state S3 to be entered regularly to make checks including checking a set of server response timers, which may be memory locations storing connection request times.

[0043] Referring now to FIG. 3B, state 0 is an initial state for the connection assignor. When a connection request is received from a client, a server is assigned to the connection request, and a transition is made back to state 0. When a rebalancing message is received from the load monitor, assignment details are adjusted based upon the rebalancing message.

[0044] For example, if servers are assigned in a round robin fashion, the rebalancing message may instruct the connection assignor to remove a particular server from the circle for a specified period of time or until a later rebalancing message indicates otherwise. Thus, a rebalancing message can cause a server to be taken offline for new connections temporarily or indefinitely.
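The behavior of the connection assignor of FIG. 3B under round robin assignment may be sketched as follows. This is an illustrative assumption rather than a required implementation: the class name, method names and message handling are hypothetical, and a removal with no period is treated as lasting until a later message reinstates the server.

```python
class ConnectionAssignor:
    """Hypothetical round robin assignor that honors rebalancing messages."""

    def __init__(self, servers):
        self._servers = list(servers)
        # server -> time at which it returns to the circle,
        # or None when it is removed until further notice.
        self._offline_until = {}
        self._next = 0

    def handle_overloaded(self, server, now, period=None):
        # Remove the server for a specified period, or indefinitely.
        self._offline_until[server] = None if period is None else now + period

    def handle_available(self, server):
        # A later rebalancing message reinstates the server.
        self._offline_until.pop(server, None)

    def _is_online(self, server, now):
        if server not in self._offline_until:
            return True
        expiry = self._offline_until[server]
        if expiry is not None and now >= expiry:
            del self._offline_until[server]  # the removal period has elapsed
            return True
        return False

    def assign(self, now):
        # Walk the circle at most once, skipping servers that are offline.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if self._is_online(server, now):
                return server
        return None  # every server is currently offline


# Illustrative run: s2 is removed for ten time units.
assignor = ConnectionAssignor(["s1", "s2", "s3"])
assignor.handle_overloaded("s2", now=0, period=10)
during_removal = [assignor.assign(now=1) for _ in range(3)]
after_removal = assignor.assign(now=10)
```

Keeping the removed server's position in the circle, rather than deleting it from the list, means that reinstatement requires no reordering of the remaining servers.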

[0045] FIGS. 3C, 3D and 3E are logic flow diagrams illustrating details of the states and transitions of FIGS. 3A and 3B.

[0046] FIG. 3C is a logic flow diagram illustrating the processes and outputs of state S1. The process begins at block 300, in which a client message is checked to determine if it is a connection request. If so, control passes to block 302. If not, control passes to block 304.

[0047] In block 302, a server response timer for this connection request is started. This timer may be implemented as a time stamp stored in a variable or memory location for later checking. Alternatively, the timer may be an actual interrupt timer, which interrupts the other processes after a prescribed period. Following block 302, the process ends.

[0048] In block 304, a check is made to determine if the client message is a connection termination message. If not, the process ends. If so, control passes to block 306.

[0049] Although connection termination is described here as occurring upon the request of the client, other scenarios are possible. For example, in a full duplex connection, such as TCP, a server considers a connection to be open even after the client has terminated its transmissions. In that case, a server termination message would need to be intercepted to trigger the processes shown in blocks 304-312.

[0050] In block 306, a connection counter is decremented for the server handling the client that sent the connection termination request. Then, in block 308, a check is made to determine if this particular server is currently offline for new connections, for example, due to excessive connections or possibly due to a previous failure to respond to a connection request. If not, the process ends. If so, control passes to block 310.

[0051] In block 310, a check is made to determine if the current connection count for the server determined to be offline is reasonable. If not, the process ends. If so, control passes to block 312. A number that is considered reasonable will depend on the particular implementation, and/or on the preferences of the system administrator, and may depend upon the underlying reason that the server is offline. For example, if the server is currently offline for new connections because it reached its maximum number of connections, the reasonable number may be set to be a predetermined number less than the maximum, for example, the maximum number minus two. If the server is currently offline for new connections because of a previous failure to respond to a connection request, the reasonable number may be a percentage of the maximum, for example, seventy five percent.

[0052] Block 310 may also include checks of current server bandwidth consumption. For example, if the server is currently offline for new connections because of a previous failure to respond to a connection request, a check may be made in block 310 to determine if both the number of connections being handled by the server and the bandwidth being consumed by the server are reasonable.
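The check of block 310 may be illustrated with the example thresholds given above. The function names and the reason labels are hypothetical, and the sketch assumes that a count at or below the reasonable number qualifies the server for reinstatement; the optional bandwidth check is omitted for brevity.

```python
def reasonable_limit(max_connections, offline_reason):
    """Hypothetical threshold for block 310, using the example values above."""
    if offline_reason == "max_connections":
        # Predetermined number less than the maximum: maximum minus two.
        return max_connections - 2
    if offline_reason == "no_response":
        # Percentage of the maximum: seventy-five percent (integer math).
        return max_connections * 3 // 4
    raise ValueError("unknown offline reason: " + offline_reason)


def should_reinstate(current_connections, max_connections, offline_reason):
    # Assumption: "reasonable" means at or below the computed limit.
    return current_connections <= reasonable_limit(max_connections, offline_reason)


# Illustrative values for a server with a maximum of 100 connections.
limit_after_overload = reasonable_limit(100, "max_connections")
limit_after_timeout = reasonable_limit(100, "no_response")
```

Using a lower threshold after a failure to respond reflects the example above, in which a server that previously timed out must shed more load before it is trusted with new connections.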

[0053] In block 312, a server available message is sent. This message indicates that the server is now available for new connection requests. Following this, the process ends.

[0054] FIG. 3D is a logic flow diagram illustrating the processes and outputs of state S2. The process begins at block 330, in which a server message is checked to determine if it is a connection acknowledgement. If so, control passes to block 332. If not, control passes to block 342.

[0055] In block 332, the connection counter for the server sending the acknowledgement message is incremented. Then, in block 334, a check is made to determine if the current connection count is excessive, for example, greater than a predetermined number. If so, control passes to block 336, in which a server overloaded message is sent. Then the process ends.

[0056] A number that is considered excessive for the connection count will depend on the particular implementation and may also depend upon the current bandwidth consumption for the server. The server overloaded message sent in block 336 indicates that the server is unavailable for new connections. This unavailability may be temporary or indefinite. For example, the server overloaded message may indicate removal of this server from contention for new connections for a specified period of time, or the server overload message may indicate removal of this server from contention for new connections until a future message indicates otherwise.

[0057] If the current connection count was determined not excessive in block 334, then a check is made in block 338 to determine if the server response took an excessive amount of time, for example, longer than a predetermined duration. If not, the process ends. If so, control passes to block 340, in which a server overloaded message is sent.
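The acknowledgement branch of FIG. 3D (blocks 332 through 340) may be sketched as a single function. The function name, the message tuple format and the fixed thresholds are assumptions made for illustration.

```python
def on_connection_ack(counts, server, response_time,
                      max_connections, max_response_time):
    """Hypothetical handler for a connection acknowledgement from a server.

    Returns an ("overloaded", server) message, or None if no message is sent.
    """
    # Block 332: increment the connection counter for this server.
    counts[server] = counts.get(server, 0) + 1
    # Blocks 334 and 336: excessive connection count.
    if counts[server] > max_connections:
        return ("overloaded", server)
    # Blocks 338 and 340: excessive response time.
    if response_time > max_response_time:
        return ("overloaded", server)
    return None


# Illustrative values: at most 3 connections, 0.5 second response limit.
counts = {"s1": 2}
first = on_connection_ack(counts, "s1", 0.1, 3, 0.5)
second = on_connection_ack(counts, "s1", 0.1, 3, 0.5)
slow = on_connection_ack(counts, "s2", 0.9, 3, 0.5)
```

Here the third connection to s1 is accepted without a message, the fourth exceeds the connection limit, and the first connection to s2 triggers a message solely because of its response time.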

[0058] An amount of time to respond that is considered excessive will depend on the particular implementation. For example, if the server resides on a fast Ethernet network, a reasonable response time may be 500 milliseconds or less. Moreover, the reasonable response time may be server dependent. For example, a load monitor may keep historical information for each server that indicates how long on average each server takes to respond. Server overload may then be identified in block 338 upon detecting a response time that is slower than the historical average.
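A server-dependent threshold of this kind may be sketched with a running mean kept for each server. The class and method names are hypothetical. The sketch assumes a simple arithmetic mean over all recorded responses; an implementation might instead use a weighted or windowed average, or add a tolerance margin so that ordinary variation does not trigger a message.

```python
class ResponseHistory:
    """Hypothetical per-server record of average response time."""

    def __init__(self):
        self._count = {}
        self._mean = {}

    def is_overloaded(self, server, response_time):
        # Overload is indicated when the response is slower than the
        # historical average; with no history, no judgment is made.
        mean = self._mean.get(server)
        return mean is not None and response_time > mean

    def record(self, server, response_time):
        # Incremental update of the arithmetic mean.
        n = self._count.get(server, 0) + 1
        mean = self._mean.get(server, 0.0)
        self._mean[server] = mean + (response_time - mean) / n
        self._count[server] = n

    def average(self, server):
        return self._mean.get(server)


# Illustrative values in seconds.
history = ResponseHistory()
history.record("s1", 0.25)
history.record("s1", 0.75)
```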

[0059] In block 342, a check is made to determine if the server message is a data message. If not, the process ends. If so, control passes to block 344, in which the current bandwidth consumption is recalculated for the server sending the message. Then, the process ends. Alternatively, no check is made in block 342, and all server messages result in a recalculation of server bandwidth consumption.
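The manner of recalculating bandwidth consumption in block 344 is left to the implementation. One possible approach, assumed here purely for illustration, is to total the bytes observed from a server over a sliding time window. The class name, method names and window length are hypothetical.

```python
from collections import deque


class BandwidthMeter:
    """Hypothetical sliding-window estimate of one server's bandwidth."""

    def __init__(self, window_seconds):
        self._window = window_seconds
        self._samples = deque()  # (time observed, message size in bytes)
        self._total = 0

    def _expire(self, now):
        # Drop samples that have aged out of the window.
        while self._samples and self._samples[0][0] <= now - self._window:
            _, size = self._samples.popleft()
            self._total -= size

    def record(self, now, size):
        # Called for each server message counted toward bandwidth.
        self._samples.append((now, size))
        self._total += size
        self._expire(now)

    def bytes_per_second(self, now):
        self._expire(now)
        return self._total / self._window


# Illustrative values: a two second window.
meter = BandwidthMeter(window_seconds=2.0)
meter.record(0.0, 1000)
meter.record(1.0, 3000)
rate_early = meter.bytes_per_second(1.0)
rate_late = meter.bytes_per_second(2.5)
```

In this run both messages fall within the window at time 1.0, while by time 2.5 the first message has aged out and only the second contributes to the estimate.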

[0060] FIG. 3E is a logic flow diagram illustrating the processes and outputs of state S3. The process begins at block 360, in which a check is made to determine if there are any outstanding connection requests. An outstanding connection request is any client connection request that has not been responded to by a server within a predetermined time-out period or duration. If there are outstanding connection requests, control passes to block 362. If not, control passes to block 364.

[0061] In block 362, a server overloaded message is sent for each server having outstanding responses. These messages indicate that each server having outstanding responses is unavailable for new connection requests, either temporarily or indefinitely. These messages also may indicate that the outstanding connection requests are to be reassigned to available servers. Following this, control passes to block 364.

[0062] In block 364, a check is made to determine if there are servers, other than a server taken offline in block 362 immediately before, that are offline for new connection requests. If not, control passes to block 370. If so, control passes to block 366.

[0063] In block 366, a check is made for each server identified in block 364, to determine if the server has experienced a significant drop in bandwidth consumption. An amount of bandwidth consumption that is considered significant will depend on the particular implementation and factors including network capacity and number of servers. If no offline server has experienced a significant drop in bandwidth consumption, control passes to block 370. Otherwise, control passes to block 368.

[0064] In block 368, a server available message is sent for each offline server experiencing a significant drop in bandwidth consumption. These messages indicate that each such server is now available for new connection requests. Following this, control passes to block 370.

[0065] In block 370, the bandwidth consumption for each server is checked to determine if a server should be taken offline for new connections due to excessive bandwidth consumption. If so, control passes to block 372. If not, the process ends.

[0066] In block 372, a server overloaded message is sent for each server consuming excessive bandwidth. These messages indicate that each such server is unavailable for new connection requests, either temporarily or indefinitely. Following this, the process ends.
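The flow of FIG. 3E may be summarized in a single function, offered as a sketch under several assumptions: the function name, parameter names and message tuples are hypothetical; a "significant drop" is modeled as a fractional decrease from a previously recorded bandwidth figure; and servers that are already offline are skipped in block 370 to avoid redundant messages.

```python
def periodic_check(now, pending, offline, bandwidth, previous_bandwidth,
                   timeout, drop_fraction, max_bandwidth):
    """Hypothetical state S3 check; returns the rebalancing messages to send."""
    messages = []

    # Blocks 360 and 362: servers with outstanding connection requests.
    just_removed = set()
    for server in sorted(pending):
        if any(now - started > timeout for started in pending[server]):
            messages.append(("overloaded", server))
            just_removed.add(server)

    # Blocks 364 through 368: reinstate offline servers (other than those
    # just removed) whose bandwidth consumption dropped significantly.
    for server in sorted(offline - just_removed):
        before = previous_bandwidth.get(server, 0.0)
        current = bandwidth.get(server, 0.0)
        if before > 0 and current <= before * (1 - drop_fraction):
            messages.append(("available", server))

    # Blocks 370 and 372: servers consuming excessive bandwidth.
    for server in sorted(bandwidth):
        if server in offline or server in just_removed:
            continue  # assumption: no repeat message for offline servers
        if bandwidth[server] > max_bandwidth:
            messages.append(("overloaded", server))

    return messages


# Illustrative values only.
result = periodic_check(
    now=10.0,
    pending={"s1": [0.0], "s2": [9.9]},
    offline={"s3"},
    bandwidth={"s1": 10, "s2": 900, "s3": 100, "s4": 50},
    previous_bandwidth={"s3": 400},
    timeout=0.5,
    drop_fraction=0.5,
    max_bandwidth=800,
)
```

In this run, s1 has a request outstanding well past the time-out, s3 has dropped to a quarter of its earlier bandwidth, and s2 exceeds the bandwidth limit.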

[0067] FIG. 4 is a block diagram illustrating an example computing environment. An example machine 400 includes a processing system 402, which may include a central processing unit such as a microprocessor or microcontroller for executing programs to control tasks in the machine 400, thereby enabling the features and functions described above. Moreover, the processing system 402 may include one or more additional processors, which may be discrete processors or may be built in to the central processing unit.

[0068] The processing system 402 is coupled with a bus 404, which provides a set of signals for communicating with the processing system 402 and may include a data channel for facilitating information transfer between storage and other peripheral components of the machine 400.

[0069] The machine 400 may include embedded controllers, such as Generic or Programmable Logic Devices or Arrays (PLD, PLA, GAL, PAL), Field Programmable Gate Arrays (FPGA), Application Specific Integrated Circuits (ASIC), single-chip computers, smart cards, or the like, which may serve as the processing system 402.

[0070] The machine 400 may include a main memory 406 and one or more cache memories, and may also include a secondary memory 408. These memories provide storage of instructions and data for programs executing on the processing system 402, and may be semiconductor based and/or non-semiconductor based memory. The secondary memory 408 may include, for example, a hard disk drive 410, a removable storage drive 412 and/or a storage interface 420.

[0071] The machine 400 may also include a display system 424 for connecting to a display device 426. The machine 400 includes an input/output (I/O) system 430 (i.e., one or more controllers or adapters for providing interface functions) for connecting to one or more I/O devices 432-434. The I/O system 430 may provide a communications interface, which allows software and data to be transferred, in the form of signals 442, between machine 400 and external devices, networks or information sources. The signals 442 may be any signals (e.g., electronic, electromagnetic, optical, etc.) capable of being received via a channel 440 (e.g., wire, cable, optical fiber, phone line, infrared (IR) channel, radio frequency (RF) channel, etc.). A communications interface used to receive these signals 442 may be a network interface card designed for a particular type of network, protocol and channel medium, or may be designed to serve multiple networks, protocols and/or channel media.

[0072] Machine-readable instructions (also known as programs, software or code) are stored in the machine 400 and/or are delivered to the machine 400 over a communications interface. As used herein, the term “machine-readable medium” refers to any media used to provide one or more sequences of one or more instructions to the processing system 402 for execution.

[0073] Other systems, architectures, and modifications and/or reconfigurations of machine 400 of FIG. 4 are also possible. The various implementations described above have been presented by way of example only, and not limitation. For example, the logic flows depicted in FIGS. 3A-3E do not require the particular order shown. Multi-tasking and/or parallel processing may be preferable. Thus, other embodiments may be within the scope of the following claims.

What is claimed is:
 1. A system comprising: a communications interface configurable to monitor messages on a network; a processor coupled with the communications interface; and a machine-readable medium coupled with the processor, the machine-readable medium being configured to instruct the processor to access machine-readable instructions to cause the processor to perform operations including, identifying a message requesting a connection with a data delivery system having a plurality of nodes, initiating a tracking of a duration after the message identification, and identifying a node of the data delivery system as being overloaded based upon the tracked duration.
 2. The system of claim 1, wherein the operations further include balancing load among the plurality of nodes in the data delivery system.
 3. The system of claim 2, wherein balancing load among the plurality of nodes comprises removing the overloaded node from contention for new connections.
 4. The system of claim 3, wherein removing the overloaded node comprises transmitting a message to initiate a removal.
 5. The system of claim 2, wherein the tracked duration comprises an amount of time between the message identification and either an identification of an acknowledging message from the node or a termination of a time-out period during which no acknowledging message is returned by the node.
 6. The system of claim 5, wherein the node identifying operation comprises comparing the tracked duration to an historical average for the node.
 7. The system of claim 5, wherein the communications interface comprises an Ethernet card.
 8. The system of claim 7, wherein the network comprises an Internet Protocol network.
 9. The system of claim 8, wherein the data delivery system comprises a web server cluster.
 10. A method of adjusting assignment of connection requests to nodes in a data delivery system, the method comprising: monitoring messages on a network; identifying a message requesting a connection with a data delivery system having multiple nodes; initiating a tracking of a duration after the message identification; and adjusting assignment of connection requests based upon the tracked duration.
 11. The method of claim 10, wherein the tracked duration comprises an amount of time between the message identification and either an identification of an acknowledging message from an assigned node in the data delivery system or a termination of a time-out period during which no acknowledging message is returned by the assigned node.
 12. The method of claim 11, wherein the adjusting comprises: identifying the assigned node as overloaded; and removing the assigned node from contention for new connections.
 13. The method of claim 12, wherein the node-overloaded identification comprises comparing the tracked duration to an historical average for the assigned node.
 14. The method of claim 12, wherein the removing comprises transmitting a message to initiate a removal.
 15. The method of claim 10, wherein the adjusting of assignment of connection requests is further based upon a bandwidth consumption.
 16. The method of claim 15, wherein the adjusting comprises: identifying an assigned node as overloaded if the tracked duration exceeds a maximum time; removing the assigned node from contention for new connections if the assigned node is identified as overloaded; identifying a removed node as available to service new connection requests if a bandwidth consumption for the removed node drops; and reinstating the removed node to receive new connection requests if the removed node is identified as available to service new connection requests.
 17. The method of claim 16, wherein the adjusting further comprises: identifying the assigned node as overloaded if a number of connections for the assigned node exceeds a maximum connections; identifying the assigned node as overloaded if the bandwidth consumption for the assigned node exceeds a maximum bandwidth; and identifying the removed node as available to service new connection requests if the number of connections for the removed node drops below the maximum connections.
 18. The method of claim 17, wherein the maximum time, the maximum connections and the maximum bandwidth are each node-specific.
 19. The method of claim 10, wherein the data delivery system comprises a web server cluster.
 20. Machine-readable instructions, embodied in a machine-readable medium or a propagated signal, for causing a machine to perform operations comprising: monitoring messages on a network; identifying a message requesting a connection with a data delivery system having multiple nodes; initiating a tracking of a duration after the message identification; and adjusting assignment of connection requests based upon the tracked duration.
 21. The instructions of claim 20, wherein the tracked duration comprises an amount of time between the message identification and either an identification of an acknowledging message from an assigned node in the data delivery system or a termination of a time-out period during which no acknowledging message is returned by the assigned node.
 22. The instructions of claim 21, wherein the adjusting comprises: identifying the assigned node as overloaded; and removing the assigned node from contention for new connections.
 23. The instructions of claim 22, wherein the node-overloaded identification comprises comparing the tracked duration to an historical average for the assigned node.
 24. The instructions of claim 22, wherein the removing comprises transmitting a message to initiate a removal.
 25. A system comprising: means for accessing messages on a network; means for storing data from the messages; and processing means for initiating tracking a duration after a message requesting a connection with a data delivery system having a plurality of nodes, identifying a node from the data delivery system as overloaded based upon the tracked duration, and removing the node from contention for new connections if the node is identified as overloaded.
 26. The system of claim 25, wherein the removing comprises transmitting a message to initiate a removal.
 27. The system of claim 25, wherein the tracked duration comprises an amount of time between the requesting message and either an acknowledging message from the node or a time-out period during which no acknowledging message is returned by the node.
 28. The system of claim 27, wherein the identifying comprises comparing the tracked duration to an historical average for the node.
 29. A method of adjusting assignment of connection requests to nodes in a data delivery system, the method comprising: monitoring messages on a network; tracking a current number of open connections for each node; calculating a current bandwidth consumption for each node; and adjusting assignment of connection requests based upon the tracked connections and the calculated bandwidth consumption.
 30. The method of claim 29, wherein the adjusting comprises transmitting a message to initiate an adjustment.
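
By way of illustration only, the overload and reinstatement tests recited in claims 16 through 18 may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names Node, Monitor, on_request, on_ack, expire and update_usage are hypothetical, timestamps are supplied by the caller rather than read from captured packets, and a "drop" in bandwidth consumption is assumed to mean falling below the node-specific maximum.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # All thresholds are node-specific; the field names are illustrative only.
    name: str
    max_time: float         # longest acceptable connection-establishment time (s)
    max_connections: int    # most open connections before the node is overloaded
    max_bandwidth: float    # most bandwidth (bytes/s) before the node is overloaded
    connections: int = 0
    bandwidth: float = 0.0
    in_contention: bool = True   # eligible to receive new connection requests


class Monitor:
    """Tracks connection-establishment time per request (hypothetical helper)."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self.pending = {}   # connection id -> (node name, time request was seen)

    def on_request(self, conn_id, node_name, now):
        # A connection-request message was identified: start tracking a duration.
        self.pending[conn_id] = (node_name, now)

    def on_ack(self, conn_id, now):
        # The assigned node acknowledged: the tracked duration ends here.
        node_name, started = self.pending.pop(conn_id)
        node = self.nodes[node_name]
        node.connections += 1
        self._check_overload(node, now - started)

    def expire(self, now, timeout):
        # No acknowledgment within the time-out period: treat the node as
        # overloaded (or crashed) and remove it from contention.
        for conn_id, (node_name, started) in list(self.pending.items()):
            if now - started >= timeout:
                del self.pending[conn_id]
                self.nodes[node_name].in_contention = False

    def update_usage(self, node_name, connections, bandwidth):
        # Periodic usage report; reinstate a removed node once both figures
        # are back under its maxima (assumed meaning of "drops").
        node = self.nodes[node_name]
        node.connections = connections
        node.bandwidth = bandwidth
        if (not node.in_contention
                and connections < node.max_connections
                and bandwidth < node.max_bandwidth):
            node.in_contention = True

    def _check_overload(self, node, duration):
        if (duration > node.max_time
                or node.connections > node.max_connections
                or node.bandwidth > node.max_bandwidth):
            node.in_contention = False


monitor = Monitor([Node("a", max_time=0.5, max_connections=100, max_bandwidth=1e6)])
monitor.on_request(1, "a", now=0.0)
monitor.on_ack(1, now=0.2)            # 0.2 s is within max_time
fast_ok = monitor.nodes["a"].in_contention
monitor.on_request(2, "a", now=1.0)
monitor.on_ack(2, now=2.0)            # 1.0 s exceeds max_time
slow_removed = not monitor.nodes["a"].in_contention
monitor.update_usage("a", connections=1, bandwidth=10.0)
reinstated = monitor.nodes["a"].in_contention
```

In this sketch a removed node is reinstated only by a later usage report, so a single slow acknowledgment keeps the node out of contention until its connection count and bandwidth are observed below the maxima.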